Using large clinical corpora for query expansion in text-based cohort identification
Similar resources
Using large clinical corpora for query expansion in text-based cohort identification
In light of the heightened problems of polysemy, synonymy, and hyponymy in clinical text, we hypothesize that patient cohort identification can be improved by using a large, in-domain clinical corpus for query expansion. We evaluate the utility of four auxiliary collections for the Text REtrieval Conference task of IR-based cohort retrieval, considering the effects of collection size, the inher...
Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology
Due to the growth of the web, there are many challenges in establishing a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, but the problem ...
Ontology-based Query Expansion for Arabic Text Retrieval
Semantic resources are important parts of Information Retrieval (IR) applications such as search engines and Question Answering (QA); these resources should be available, readable and understandable. In the semantic web, the ontology plays a central role in information retrieval, where it is used to retrieve more relevant information from unstructured information. This paper presents a semantic-based ...
Text Mining Based Query Expansion for Chinese IR
Query expansion has long been suggested as a technique for dealing with the word mismatch problem in information retrieval. In this paper, we describe a novel query expansion method which incorporates text mining techniques into query expansion to improve Chinese information retrieval performance. Unlike most of the existing query expansion strategies, which generally select indexing terms from t...
Efficient Web Crawling for Large Text Corpora
Many researchers use texts from the web, an easy source of linguistic data in a great variety of languages. Building text corpora that are both large and of good quality is the challenge we face nowadays. In this paper we describe how to deal with inefficient data downloading and how to focus crawling on text-rich web domains. The idea has been successfully implemented in SpiderLing. We present efficiency ...
Journal
Journal title: Journal of Biomedical Informatics
Year: 2014
ISSN: 1532-0464
DOI: 10.1016/j.jbi.2014.03.010